Dimensionality and data reduction in telecom churn prediction

Authors

  • Wei-Chao Lin
  • Chih-Fong Tsai
  • Shih-Wen Ke
Abstract

Purpose – Churn prediction is an important task in customer relationship management, and many data mining techniques can be used for it. Two data preprocessing steps matter in particular: dimensionality reduction (or feature selection), which filters out irrelevant features, and data reduction, which filters out noisy data samples. The purpose of this paper is to perform these preprocessing tasks so that the mining algorithm produces good-quality results.

Design/methodology/approach – Based on a real telecom customer churn data set, seven preprocessed data sets, produced by applying feature selection and data reduction in different orders, are used to train an artificial neural network as the churn prediction model.

Findings – The results show that performing data reduction first, with self-organizing maps, and feature selection second, with principal component analysis, gives the prediction model its highest prediction accuracy. This order also makes learning more efficient, since it removes 66 percent of the original features and 62 percent of the original data samples.

Originality/value – The contribution of this paper is to identify the better order in which to perform these two data preprocessing steps for telecom churn prediction.
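The ordering reported in the findings (data reduction with a self-organizing map first, then principal component analysis) can be illustrated with a minimal, standard-library sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the tiny 1-D map, the rule of keeping the samples closest to their best-matching unit, the power-iteration PCA, and the synthetic data. The neural network that would be trained on the result is omitted.

```python
import math
import random


def nearest_unit(units, x):
    """Index of the SOM unit closest to sample x (the best-matching unit)."""
    return min(range(len(units)), key=lambda j: math.dist(units[j], x))


def train_som(data, n_units=4, epochs=30, lr=0.5, seed=0):
    """Train a tiny 1-D self-organizing map and return its unit weights."""
    rng = random.Random(seed)
    units = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # shrinks learning rate and radius
        radius = max(0.5, (n_units / 2) * decay)
        for x in data:
            b = nearest_unit(units, x)
            for j, w in enumerate(units):
                # Units near the winner on the map are pulled more strongly.
                h = math.exp(-((j - b) ** 2) / (2 * radius ** 2))
                for k in range(len(w)):
                    w[k] += lr * decay * h * (x[k] - w[k])
    return units


def reduce_samples(data, units, keep_fraction=0.4):
    """Assumed rule: keep the samples lying closest to their best-matching unit."""
    dist = [math.dist(x, units[nearest_unit(units, x)]) for x in data]
    n_keep = max(1, round(keep_fraction * len(data)))
    order = sorted(range(len(data)), key=lambda i: dist[i])
    return sorted(order[:n_keep])  # indices of retained samples


def pca(data, n_components, seed=0):
    """Project data onto its leading principal components (power iteration)."""
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    mean = [sum(row[k] for row in data) / n for k in range(d)]
    centered = [[row[k] - mean[k] for k in range(d)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        for _ in range(200):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0.0:
                break
            v = [c / norm for c in w]
        # Eigenvalue estimate, then deflate so the next pass finds a new direction.
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        components.append(v)
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [[sum(r[k] * c[k] for k in range(d)) for c in components]
            for r in centered]


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic stand-in for customer records: 50 samples, 6 features.
    samples = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(50)]
    som_units = train_som(samples)
    kept = reduce_samples(samples, som_units, keep_fraction=0.4)  # step 1: data reduction
    features = pca([samples[i] for i in kept], n_components=2)    # step 2: PCA
    print(len(features), "samples x", len(features[0]), "features")
```

The fractions used in the demo only loosely echo the reported reductions; in practice the number of retained samples and components would be chosen from the data rather than fixed in advance.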

Similar resources

Customer Behavior Mining Framework (CBMF) using clustering and classification techniques

The present study proposes a Customer Behavior Mining Framework on the basis of data mining techniques in a telecom company. This framework takes into account the customers’ behavior patterns and predicts the way they may act in the future. Firstly, clustering technique is used to implement portfolio analysis and previous customers are divided based on socio-demographic features using k...

Churn prediction in telecom using Random Forest and PSO based data balancing in combination with various feature selection strategies

The telecommunication industry faces fierce competition to retain customers, and therefore requires an efficient churn prediction model to monitor the customer’s churn. Enormous size, high dimensionality and imbalanced nature of telecommunication datasets are main hurdles in attaining the desired performance for churn prediction. In this study, we investigate the significance of a Particle Swar...

Application of Feature Extraction Method in Customer Churn Prediction Based on Random Forest and Transduction

With the development of telecom business, customer churn prediction becomes more and more important. An outstanding issue in customer churn prediction is high dimensional problem. Curse of dimensionality will easily occur if effective feature extraction is not applied during modeling. Among the most popular feature extraction approaches, principal component analysis (PCA) method based on induct...

Customers Churn Prediction and Attribute Selection in Telecom Industry Using Kernelized Extreme Learning Machine and Bat Algorithms

With the fast development of digital systems and concomitant information technologies, there is certainly an incipient spirit in the extensive overall economy to put together digital Customer Relationship Management (CRM) systems. This slanting is further more palpable in the telecommunications industry, in which businesses turn out to be increasingly digitalized. Customer churn prediction is a...

Social Network Analysis for Churn Prediction in Telecom Data

Social Network Analysis (SNA) is a set of research procedures for identifying group of people who share common structures in systems based on the relations among actors. Grounded in graph and system theories, this approach has proven to be powerful measures for studying networks in various industries like Telecommunication, banking, physics and social world, including on the web. Since Telecomm...

Journal:
  • Kybernetes

Volume 43

Publication date: 2014